NSF PAR Search | NSF Public Access Repository

Fairness through difference awareness: Measuring desired group discrimination in LLMs

Wang, A; Phan, M; Ho, D E; Koyejo, S (July 2025, Association for Computational Linguistics)

Free, publicly-accessible full text available July 31, 2026
Toward an evaluation science for generative AI systems

Weidinger, L; Raji, I D; Wallach, H; Mitchell, M; Wang, A; Salaudeen, O; Bommasani, R; Ganguli, D; Koyejo, S; Isaac, W (March 2025, National Academy of Engineering: Spring Bridge on AI: Promises and Risks)

Free, publicly-accessible full text available March 31, 2026
DNN Architecture Attacks via Network and Power Side Channels

Dai, Y; Guo, Q; Wang, A (October 2024, Springer, Cham)

Full Text Available
Near-surface turbulence in the Arctic (Utqiagvik, AK, March - April 2022)

https://doi.org/10.26208/9qa7-1e73

Pan, Y; Fuentes, J D; Warner, G_R T; Anderson, A J; Wang, A; Katsouros, M (January 2025, Penn State Data Commons)

This flux-tower observational campaign took place in Utqiagvik, AK. A 12-m tower was installed in February 2022 to collect turbulence data at five heights (0.5 m, 1.5 m, 2.5 m, 3.5 m, and 7.5 m). At each height, a Campbell Scientific CSAT3B sonic anemometer measured three velocity components and virtual temperature at 50 Hz, and an R. M. Young temperature and relative humidity sensor measured air temperature and relative humidity at 1 Hz. Effective data collection ran from March to April 2022, when the tower was taken down. This was the first dataset of Arctic turbulence collected at 50 Hz, a frequency substantially higher than the 10 Hz and 20 Hz of previous measurements. Given the strongly stable conditions in the Arctic, increasing the sampling frequency to 50 Hz was critical to resolving near-surface turbulence within, or at least close to, the inertial subrange.
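As a rough illustration of why the sampling rate matters, the sketch below computes the highest frequency each rate can represent without aliasing (the Nyquist frequency, half the sampling rate). The function name is hypothetical, and the calculation is standard signal-processing arithmetic rather than anything taken from the dataset itself.

```python
def nyquist_hz(sampling_rate_hz: float) -> float:
    """Return the highest frequency representable at this sampling rate."""
    return sampling_rate_hz / 2.0


# Compare the sampling rates mentioned in the dataset description.
for rate in (10, 20, 50):
    print(f"{rate} Hz sampling resolves fluctuations up to {nyquist_hz(rate)} Hz")
```

Under this simple view, moving from 10 Hz or 20 Hz to 50 Hz raises the resolvable limit from 5 Hz or 10 Hz to 25 Hz, which is the sense in which smaller, faster eddies become observable.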
Does Differential Privacy Impact Bias in Pretrained Language Models?

Islam, MK; Wang, A; Wang, T; Ji, Y; Fox, J; Zhao, J (June 2024, IEEE Data Engineering Bulletin (Special Issue on Privacy-preserving Data Management) Vol. 48 No. 2, June 2024.)
Wang, H; Xiao, X (Ed.)
Differential privacy (DP) is applied when fine-tuning pre-trained language models (LMs) to limit leakage of training examples. While most DP research has focused on improving a model’s privacy-utility tradeoff, some studies find that DP can be unfair to or biased against underrepresented groups. In this work, we extensively analyze the impact of DP on bias in LMs. We find that differentially private training can increase model bias against protected groups with respect to AUC-based bias metrics. DP makes it more difficult for the model to differentiate between positive and negative examples from the protected groups and from other groups in the rest of the population. Our results also show that the impact of DP on bias is affected by both the privacy protection level and the underlying distribution of the dataset.
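The abstract does not say which AUC-based bias metrics were used. As a minimal sketch of the general idea, the code below compares the AUC over all examples with the AUC restricted to one group; a lower subgroup AUC means the model separates positives from negatives less well for that group. All function names and the toy data are hypothetical.

```python
from typing import List, Sequence


def pairwise_auc(pos_scores: List[float], neg_scores: List[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def overall_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC over every example."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    return pairwise_auc(pos, neg)


def subgroup_auc(labels: Sequence[int], scores: Sequence[float],
                 in_group: Sequence[bool]) -> float:
    """AUC restricted to examples that belong to the group of interest."""
    pos = [s for y, s, g in zip(labels, scores, in_group) if g and y == 1]
    neg = [s for y, s, g in zip(labels, scores, in_group) if g and y == 0]
    return pairwise_auc(pos, neg)


# Toy data (assumed for illustration only).
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]
in_group = [False, False, True, True, False, False]

print("overall AUC :", overall_auc(labels, scores))
print("subgroup AUC:", subgroup_auc(labels, scores, in_group))
```

In a study like the one described, such a gap would be computed for models trained with and without DP and then compared; that comparison step is not shown here.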
Full Text Available
Can Large Language Models Generate Middle School Mathematics Explanations Better than Human Teachers?

Wang, A; Prihar, E; Heffernan, N; Lee, M; Hopman, M; Kalai, A.T; Vempala, S; Wickline, G (January 2024, LAK 2024 (submitted, in review))

The development of large language models and their measurable performance improvements on natural language tasks open the opportunity to use them in educational settings to replicate human tutoring, which is often costly and inaccessible. We are particularly interested in large language models from the GPT series, created by OpenAI. In the original study, we found that the quality of explanations generated with GPT-3.5 was poor: two different approaches to generating explanations yielded success rates of 43% and 10%. In a replication study, we examined whether the measurable improvements in GPT-4 performance led to a higher rate of success in generating valid explanations compared to GPT-3.5. The replication used GPT-4 to generate explanations for the same problems given to GPT-3.5. With GPT-4, explanation correctness improved dramatically, to a success rate of 94%. We were further interested in whether GPT-4 explanations were perceived positively compared to human-written explanations. In a preregistered follow-up study, 10 evaluators were asked to rate the quality of randomized GPT-4 and teacher-created explanations. Even with 4% of problems containing some amount of incorrect content, GPT-4 explanations were preferred over human explanations.
Full Text Available
Adaptive Sampling and Quick Anomaly Detection in Large Networks

Xian, X.; Semenov, A.; Hu, Y.; Wang, A.; Jin, Y. (October 2022, IEEE transactions on automation science and engineering)

The monitoring of data streams with a network structure has drawn increasing attention due to its wide applications in modern process control. In these applications, high-dimensional sensor nodes are interconnected through an underlying network topology. In such a case, abnormalities occurring at any node may propagate dynamically across the network and cause changes in other nodes over time. Furthermore, the high dimensionality of such data significantly increases the resource cost of data transmission and computation, such that only partial observations can be transmitted or processed in practice. Overall, how to quickly detect abnormalities in such large networks under resource constraints remains a challenge, especially because of the sampling uncertainty under dynamic anomaly occurrences and network-based patterns. In this paper, we incorporate network structure information into the monitoring and adaptive sampling methodologies for quick anomaly detection in large networks where only partial observations are available. We develop a general monitoring and adaptive sampling method and further extend it to the case with memory constraints; both exploit network distance and centrality information for better process monitoring and identification of abnormalities. Theoretical investigations of the proposed methods demonstrate their sampling efficiency in balancing exploration and exploitation, as well as a detection performance guarantee. Numerical simulations and a case study on a power network demonstrate the superiority of the proposed methods in detecting various types of shifts.
Note to Practitioners: Continuous monitoring of networks for anomalous events is critical for a large number of applications involving power networks, computer networks, epidemiological surveillance, social networks, etc. This paper aims to address the challenges of monitoring large networks when monitoring resources are limited such that only a subset of nodes in the network is observable. Specifically, we integrate network structure information of nodes to construct sequential detection methods via effective data augmentation, and to design adaptive sampling algorithms that observe suspicious nodes likely to be abnormal. The method is then generalized to the case where the memory available for computation is also constrained due to the network size. The developed method is effective for various anomaly patterns, especially when the initial anomaly occurs at random nodes in the network. The proposed methods are shown to be capable of quickly detecting changes in the network and of dynamically changing the sampling priority based on online observations in various cases, as shown in the theoretical investigation, simulations, and case studies.
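The abstract does not spell out the algorithm, so the following is only a generic sketch of the exploration versus exploitation trade-off it mentions, not the authors' method. It keeps a one-sided CUSUM statistic per node, spends most of a fixed observation budget on the nodes with the largest statistics, and spends the rest on randomly chosen other nodes. It ignores the network distance and centrality information the paper exploits. All names, the reference value, and the toy statistics are assumptions.

```python
import random
from typing import Dict, List


def cusum_update(stat: float, observation: float, reference: float = 0.5) -> float:
    """One-sided CUSUM step: accumulate evidence of an upward shift, floored at 0."""
    return max(0.0, stat + observation - reference)


def select_nodes(stats: Dict[str, float], budget: int, n_explore: int,
                 rng: random.Random) -> List[str]:
    """Pick which nodes to observe next under a fixed budget.

    Exploitation: nodes with the largest current statistics.
    Exploration: a random draw from the remaining nodes.
    """
    ranked = sorted(stats, key=stats.get, reverse=True)
    n_exploit = max(0, budget - n_explore)
    exploit = ranked[:n_exploit]
    remaining = ranked[n_exploit:]
    explore = rng.sample(remaining, min(n_explore, len(remaining)))
    return exploit + explore


# Toy example: four nodes, budget of two observations per step.
stats = {"a": 0.0, "b": 3.2, "c": 0.4, "d": 1.1}
chosen = select_nodes(stats, budget=2, n_explore=1, rng=random.Random(0))
print("observe next:", chosen)

# Update the statistic of an observed node with a new (standardized) reading.
stats["b"] = cusum_update(stats["b"], observation=2.0)
```

A node would typically be flagged once its statistic crosses a threshold chosen for a target false-alarm rate; that step is omitted here.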
Full Text Available
Electronic-photonic quantum systems on-chip

https://doi.org/10.1364/QUANTUM.2022.QTu4B.3

I. Wang, A. Ramesh (June 2022, Quantum 2.0 Conference and Exhibition)

Full Text Available
Sequential Recommendation via Stochastic Self-Attention

Fan, Z.; Liu, Z.; Wang, Y.; Wang, A.; Nazari, Z.; Zheng, L.; Peng, H.; Yu, P.S. (April 2022, WWW conference)

Full Text Available
A fast X-ray transient from a weak relativistic jet associated with a type Ic-BL supernova

https://doi.org/10.1038/s41550-025-02571-1

Sun, H; Li, W-X; Liu, L-D; Gao, H; Wang, X-F; Yuan, W; Zhang, B; Filippenko, A V; Xu, D; An, T; et al (July 2025, Nature Astronomy)

Free, publicly-accessible full text available July 1, 2026
